Applied Metaphors: Learning TRIZ, Complexity, Data/Stats/ML using Metaphors

Alert - I have split up this Huge website into smaller ones. Please check out the new site URLs on the Home page for the latest course content. This website will not be updated anymore. Thanks for your patience and support! 🙏


    Counts

    Happy Families are All Alike

    Qual Variables
    Bar Charts
    Column Charts
    Published

    April 16, 2024

    Modified

    July 29, 2025

    Abstract
    Visualizing Single Qual Variables

    What graphs will we see today?

    Variable #1 | Variable #2 | Chart Names | Chart Shape
    Qual | None | Bar Chart |

    What kind of Data Variables will we choose?

    No | Pronoun | Answer | Variable/Scale | Example | What Operations?
    3 | How, What Kind, What Sort | A Manner / Method, Type or Attribute from a list, with list items in some "order" (e.g. good, better, improved, best...) | Qualitative/Ordinal | Socioeconomic status (Low income, Middle income, High income), Education level (High School, BS, MS, PhD), Satisfaction rating (Very Much Dislike, Dislike, Neutral, Like, Very Much Like) | Median, Percentile

    Inspiration

    Figure 1: Capital Cities

    How much does the (financial) capital of a country contribute to its GDP? Which would be India’s city? What would be the reduction in percentage?

    And these Germans are crazy. (Toc, toc, toc, toc!)

    How do these Chart(s) Work?

    Bar charts are used to show “counts” and “tallies” with respect to Qual variables. For instance, in a survey, how many people are there of each Gender? In a Target Audience survey on Weekly Consumption, how many people fall in the low, medium, or high expenditure levels?

    Each Qual variable potentially has many levels as we saw in the Nature of Data. For instance, in the above example on Weekly Expenditure, low, medium and high were levels for the Qual variable Expenditure. Bar charts perform internal counts for each level of the Qual variable under consideration. The Bar Plot is then a set of disjoint bars representing these counts; see the icon above, and then that for histograms!! The X-axis is the set of levels in the Qual variable, and the Y-axis represents the counts for each level.
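
The "internal count" described above can be sketched in a few lines of Python. This is only an illustration, not what Orange does internally: the `responses` list is made-up survey data, and the bars are drawn as text.

```python
from collections import Counter

# Hypothetical survey responses for one Qual variable, "Expenditure"
responses = ["low", "high", "medium", "low", "low", "medium", "high", "low"]

# The "internal count": one tally per level of the Qual variable
counts = Counter(responses)

# One disjoint bar per level (X-axis: levels, Y-axis: counts)
for level in ["low", "medium", "high"]:
    print(f"{level:<7}| {'#' * counts[level]} {counts[level]}")
```

Each row of the data contributes exactly one unit to exactly one bar, which is why the bars are disjoint.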

    Bar Charts and Column Charts

    Column charts, by contrast, simply plot numbers that are already in the data against categories; there is no internal counting, as you can see in Figure 1 above.

    In many places these two names are used interchangeably, so be aware of what your tool is actually doing!
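
A minimal Python sketch of the distinction, under the assumption that a "bar chart" tool counts raw rows while a "column chart" tool plots numbers as given. The function names and the tea/coffee data are hypothetical, chosen only for illustration.

```python
from collections import Counter

def bar_chart_heights(observations):
    """Bar chart: raw observations in, counts per level out."""
    return dict(Counter(observations))

def column_chart_heights(table):
    """Column chart: (category, number) pairs in, plotted as given."""
    return dict(table)

# Hypothetical raw data: one entry per respondent
raw = ["tea", "coffee", "tea", "tea"]

# Hypothetical pre-computed numbers: one entry per category
precomputed = [("tea", 12.5), ("coffee", 7.0)]

print(bar_chart_heights(raw))             # heights come from counting
print(column_chart_heights(precomputed))  # heights are read off directly
```

If a tool is handed pre-computed numbers but counts rows anyway, every bar comes out with height 1, which is a quick way to spot the mix-up.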

    Plotting a Bar Chart

    • Using Orange
    • Using RAWgraphs
    • Using DataWrapper

    The Bar Plot widget in Orange is described here. https://orangedatamining.com/widget-catalog/visualize/barplot/

    And download the Bar Chart workflow file for this data:

    https://academy.datawrapper.de/category/74-bar-charts

    Dataset: Banned Books in the USA

    Here is a dataset from Jeremy Singer-Vine’s blog, Data Is Plural. This is a list of all books banned in schools across the US.

    Download this data to your machine and use it in Orange.

    Examine the Data

    Figure 2: Banned Books Data Table
    Figure 3: Banned Books Data Summary

    Figure 2 states that we have 1586 rows, 7 columns. So 1586 banned books are on this list! 🙀 🙀 🙀

    Figure 3 already has a thumbnail-like bar chart. We will still make a “proper” one with the appropriate widget.

    Warning

    In the workflow below, note how it is still the Distributions widget that gives the Bar Chart. This is unfortunate, since we have been at pains to state how a Bar Chart and the Histogram deal with different types of variables (Qual and Quant respectively). Just one of those things we need to get used to!!

    Data Dictionary

    Quantitative Data
    • Date of Challenge: Date the book was (selected to be?) banned
    Qualitative Data
    • Author: (text) Meta Data. Can be treated as Qual
    • Title: (text) Meta Data. Can be treated as Qual
    • State: (text) Qual factor
    • District: (text) Qual factor
    • Type of Ban: (text) Qual factor
    • Origin of Challenge: (text) Who requested the Ban?

    How many levels in each?? Find out in Orange!!
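
For readers who want to cross-check what Orange reports, here is a rough Python sketch of counting levels per Qual column. The rows in `sample_csv` are made-up placeholders, not the real dataset; only the column names "State" and "Type of Ban" are taken from the data dictionary above.

```python
import csv
import io

# Placeholder rows, NOT the real data; swap in the downloaded file to use this
sample_csv = """State,Type of Ban
State A,Ban type 1
State A,Ban type 2
State B,Ban type 1
State C,Ban type 1
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))

# A "level" is a distinct value taken by a Qual column
levels = {col: sorted({row[col] for row in rows}) for col in rows[0]}

for col, values in levels.items():
    print(f"{col}: {len(values)} levels -> {values}")
```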

    Research Questions

    Note

    Q1. Which is the US state that bans the most? Which state is least involved in banning books? What can you say of the “geography of book banning” based on your understanding of the US of A? 😀

    Figure 4: Banned Books Count by State
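
A count-by-state chart like this one boils down to tallying the State column and sorting the tallies. A small Python sketch, using placeholder state names rather than the real data:

```python
from collections import Counter

# Placeholder values: one entry per banned-book row in the data
states = ["State A", "State B", "State A", "State C", "State A", "State B"]

# most_common() returns (level, count) pairs sorted by count, descending
ranked = Counter(states).most_common()

most, least = ranked[0], ranked[-1]
print("Most bans  :", most)
print("Fewest bans:", least)
```

Note that a state with no rows in the data never appears in the tally at all, so "least" here means the smallest non-zero count.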
    Note

    Q2. Create Bar charts of the count of banned books by Reason for Banning!!

    Try!!

    What is the Story Here?

    • Figure 4 says that Texas bans the most books!
    • Florida, Oklahoma, Kansas, Indiana, ... are next in line
    • Is there a “Bible Belt” story here?
    Figure 5: Bible Belt
    • And what, Californians are too busy making money to care about book-banning!!! The state does not even show up in the chart! 😀

    • What does the second bar chart say?

    Your Turn

    1. Airbnb Price Data on the French Riviera:
    2. Apartment price vs ground living area:
    3. Fertility: A rather large and interesting fertility-related dataset, available at https://vincentarelbundock.github.io/Rdatasets/csv/AER/Fertility.csv

    Wait, But Why?

    • Always count your data (not your chickens!) before you model or infer!
    • Counts first give you an absolute sense of how much data you have.
    • Counts by different Qual variables give you a sense of the combinations you have in your data: \(\text{(Male/Female)} \times \text{(Income-Status)} \times \text{(Old/Young)} \times \text{(Urban/Rural)}\) (say \(2 \times 3 \times 2 \times 2 = 24\) combinations of data)
    • Counts then give an idea whether your data is lop-sided: do you have too many observations of one category(level) and too few of another category(level) in a given Qual variable?
    • Balance is important in order to draw decent inferences
    • And for ML algorithms, to train them properly.
    • Since the X-axis in bar charts is Qualitative (the bars don’t touch, remember!) it is possible to sort the bars at will, based on the levels within the Qualitative variables. See the approx Zipf’s Law distribution for the English alphabet below:
    Figure 6: Zipf’s Law

    In Figure 6, the letters of the alphabet are “levels” within a Qualitative variable, and these levels have been sorted based on the frequency or count!
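
Sorting bars by count, as described above, can be sketched as follows. The `text` string is a toy example chosen so the counts are easy to verify by hand; it is far too short to show a real Zipf-like shape.

```python
from collections import Counter

# Toy text: each letter is one "level" of a Qual variable
text = "banana bandana"

letters = [ch for ch in text.lower() if ch.isalpha()]
freq = Counter(letters)

# Sort levels by count (descending); ties broken alphabetically
ordered = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))

for letter, n in ordered:
    print(f"{letter} | {'#' * n} {n}")
```

Because the X-axis is Qualitative, reordering the bars like this changes nothing about the data; it only makes the ranking easier to read.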

    Readings


    License: CC BY-SA 2.0

    Website made with ❤️ and Quarto, by Arvind V.

    Hosted by Netlify.